Federated Entity Search Using On-the-Fly Consolidation

Authors

  • Daniel M. Herzig
  • Peter Mika
  • Roi Blanco
  • Thanh Tran
Abstract

Search on the Web now goes beyond the retrieval of textual Web pages and increasingly takes advantage of the growing amount of structured data. Of particular interest is entity search, where the units of retrieval are structured entities instead of textual documents. These entities reside in different sources, which may provide only limited information about their content and are therefore called “uncooperative”. Further, these sources capture complementary but also redundant information about entities. In this environment of uncooperative data sources, we study the problem of federated entity search, where redundant information about entities is reduced on-the-fly through entity consolidation performed at query time. We propose a novel method for entity consolidation that is based on language models and is completely unsupervised, and hence more suitable for this on-the-fly uncooperative setting than state-of-the-art methods that require training data. Further, we apply the same language model technique to the federated search problem of ranking results returned from different sources. Particularly novel are the mechanisms we propose to incorporate consolidation results into this ranking. We perform experiments using real Web queries and data sources. Our experiments show that our approach for federated entity search with on-the-fly consolidation improves upon the performance of a state-of-the-art preference aggregation baseline and also benefits from consolidation.
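The abstract does not spell out how the language models are compared, so the following is only a minimal sketch of what unsupervised, language-model-based consolidation at query time could look like. Everything specific here is an assumption made for illustration and is not taken from the paper: the additive smoothing with parameter `mu`, the use of Jensen-Shannon divergence as the distance, the greedy grouping strategy, the `threshold` value, and the function names themselves.

```python
import math
from collections import Counter


def language_model(text, vocab, mu=0.1):
    """Unigram language model with additive smoothing over a shared vocabulary.

    The smoothing scheme and mu are illustrative assumptions.
    """
    counts = Counter(text.lower().split())
    total = sum(counts.values())
    denom = total + mu * len(vocab)
    return {w: (counts[w] + mu) / denom for w in vocab}


def js_divergence(p, q):
    """Jensen-Shannon divergence (base 2) between two models over the same vocabulary."""
    def kl(a, b):
        return sum(a[w] * math.log2(a[w] / b[w]) for w in a if a[w] > 0)

    m = {w: 0.5 * (p[w] + q[w]) for w in p}
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def consolidate(entities, threshold=0.3):
    """Greedily group entity descriptions whose language models are close.

    entities: list of (entity_id, description_text) pairs, e.g. results
    returned by different sources for one query. An entity joins the first
    group whose first member is within `threshold`; otherwise it starts a
    new group. No training data is used.
    """
    vocab = {w for _, text in entities for w in text.lower().split()}
    models = {eid: language_model(text, vocab) for eid, text in entities}
    groups = []
    for eid, _ in entities:
        for group in groups:
            if js_divergence(models[eid], models[group[0]]) < threshold:
                group.append(eid)
                break
        else:
            groups.append([eid])
    return groups


if __name__ == "__main__":
    # Hypothetical results from two sources, "a" and "b".
    results = [
        ("a:1", "barack obama president united states"),
        ("b:7", "barack obama us president"),
        ("b:9", "eiffel tower paris france landmark"),
    ]
    print(consolidate(results))
```

Because the grouping relies only on the returned descriptions, a sketch like this needs no cooperation from the sources and no labeled co-reference pairs, which is the property the abstract emphasizes. The threshold would have to be tuned for real data.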

Similar resources

One Query to Bind Them All

Recently, SPARQL became the standard language for querying RDF data on the Web. Like other formal query languages, it applies a Boolean-match semantics, i.e. results adhere strictly to the query. Thus, queries formulated for one dataset cannot easily be reused for querying other datasets. If another target dataset is to be queried, the queries need to be rewritten using the vocabulary of the t...

Google vs. the Library: Student Preferences and Perceptions When Doing Research Using Google and a Federated Search Tool

Federated searching was once touted as the library world’s answer to Google, but ten years since federated searching technology’s inception, how does it actually compare? This study focuses on undergraduate student preferences and perceptions when doing research using both Google and a federated search tool. Students were asked about their preferences using each search tool and the perceived re...

Merging multiple information sources in federated sponsored search auctions

The recent increase in domain-specific search engines, able to discover information unknown to general-purpose search engines, has led to their federation into a single entity, called a federated search engine. In this paper, we focus on how it can effectively merge sponsored search results, provided by the domain-specific search engines, into a unique list. In particular, we discuss the case in wh...

Resource Selection for Federated Search on the Web

A publicly available dataset for federated search reflecting a real web environment has long been absent, making it difficult for researchers to test the validity of their federated search algorithms for the web setting. We present several experiments and analyses on resource selection on the web using a recently released test collection containing the results from more than a hundred real sear...

Federated search: Searching information across the AstraZeneca organisation

Finding information that is stored among many different databases has become a serious problem because of the increasing number of searchable databases on local area networks and on the Internet. Many large organisations undergoing change suffer from this problem due to a large number of databases and many different uncooperative search tools. By using federated search a single search interface provid...

Publication date: 2013